NSF PAR Search | NSF Public Access Repository

Kinetics Experiment on the Reaction of Coumarin-102 with NaOH Using Smartphone Fluorescence Imaging

https://doi.org/10.1021/acs.jchemed.4c01052

Wink, Donald J; Huma, Loredana C; Demirbuga, Mustafa; Zhang, Hongyang; Jursich, Gregory; Stec, Ewa (March 2025, Journal of Chemical Education)
Holme, Thomas A (Ed.)
This paper reports a laboratory experiment that determines the kinetics of the reaction of coumarin-102, a fluorescent dye, with sodium hydroxide. The reaction is studied by using smartphone cameras and near-UV (“blacklight”) illumination to monitor the loss of coumarin-102 fluorescence during ring-opening saponification by hydroxide. The order of the reaction in coumarin-102 is determined by examining the time course of fluorescence decay and fitting the data to integrated rate law models. The order of the reaction in sodium hydroxide is determined by varying the concentration of NaOH and comparing the effect on the reaction rate. The result is an easy-to-implement kinetics experiment that uses an engaging phenomenon (fluorescence), a convenient data-acquisition technology (smartphone cameras), and an important image-processing program (ImageJ). Students determine the reaction order for both coumarin-102 and sodium hydroxide using complementary methods. The experiment is both informative and enjoyable for students: they observe kinetics directly, quite literally with open eyes, which makes scientific concepts more tangible and engaging. The experiment has also been adapted in a manner consistent with the principles of evidence-centered design, using the content of kinetics and the scientific practice of mathematical reasoning.
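The two analyses described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' procedure: it assumes the measured fluorescence intensity is proportional to the coumarin-102 concentration, uses synthetic data with hypothetical rate constants and concentrations, and the function names (`linear_fit`, `compare_rate_laws`, `order_in_hydroxide`) are invented for the example. The order in NaOH is estimated here from a log-log fit of observed rate constants, which is one common way to compare rates across concentrations.

```python
import numpy as np

def linear_fit(x, y):
    """Least-squares line through (x, y); returns (slope, r_squared)."""
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    r2 = 1.0 - np.sum(residuals**2) / np.sum((y - np.mean(y)) ** 2)
    return slope, r2

def compare_rate_laws(t, intensity):
    """Fit the linearized zero-, first-, and second-order integrated rate laws."""
    fits = {
        "zero": linear_fit(t, intensity),            # I vs t
        "first": linear_fit(t, np.log(intensity)),   # ln I vs t
        "second": linear_fit(t, 1.0 / intensity),    # 1/I vs t
    }
    # The order whose linearized plot is straightest (highest R^2) wins.
    best = max(fits, key=lambda order: fits[order][1])
    return best, fits

def order_in_hydroxide(naoh_conc, k_obs):
    """Slope of ln(k_obs) vs ln([NaOH]) estimates the order in hydroxide."""
    slope, _ = linear_fit(np.log(naoh_conc), np.log(k_obs))
    return slope

# Synthetic first-order decay (hypothetical values, not measured data).
t = np.linspace(0.0, 300.0, 31)            # seconds
intensity = 200.0 * np.exp(-0.01 * t)      # arbitrary intensity units
best_order, fits = compare_rate_laws(t, intensity)

# Hypothetical observed rate constants at three NaOH concentrations.
naoh = np.array([0.05, 0.10, 0.20])        # mol/L
k_obs = 0.1 * naoh                         # 1/s, proportional to [NaOH]
hydroxide_order = order_in_hydroxide(naoh, k_obs)

print(best_order, -fits["first"][0], hydroxide_order)
```

With real data the intensities would come from image analysis (for example, mean pixel values exported from ImageJ), and noise would make the R² comparison less clear-cut than in this synthetic case.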
Full Text Available
Scalable Fine-tuning from Multiple Data Sources: A First-Order Approximation Approach

Li, Dongyue; Zhang, Ziniu; Wang, Lu; Zhang, Hongyang R (November 2024, Findings of the Association for Computational Linguistics: EMNLP 2024)

We study the problem of fine-tuning a language model (LM) for a target task by optimally using the information from n auxiliary tasks. This problem has broad applications in NLP, such as targeted instruction tuning and data selection in chain-of-thought fine-tuning. The key challenge is that not all auxiliary tasks help improve performance on the target task, so choosing the right subset of auxiliary tasks is crucial. Conventional subset selection methods, such as forward and backward selection, are unsuitable for LM fine-tuning because they require repeated training on subsets of auxiliary tasks. This paper introduces a new algorithm that estimates fine-tuning performance without repeated training. Our algorithm first performs multitask training using the data of all the tasks to obtain a meta initialization. Then, we approximate the fine-tuning loss of a subset using function values and gradients from the meta initialization. Empirically, we find that this gradient-based approximation holds with remarkable accuracy for twelve transformer-based LMs. Thus, we can now estimate fine-tuning performance on CPUs within a few seconds. We conduct extensive experiments to validate our approach, delivering a speedup of 30× over conventional subset selection while incurring only 1% error relative to the true fine-tuning performance. In downstream evaluations of instruction tuning and chain-of-thought fine-tuning, our approach improves over prior methods that use gradient or representation similarity for subset selection by up to 3.8%.
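The idea of estimating a loss from function values and gradients at an initialization can be illustrated with a plain first-order Taylor expansion. The sketch below is not the paper's algorithm: it uses a toy least-squares loss, synthetic data, and hypothetical names (`loss`, `grad`, `first_order_estimate`) to show that the estimate f(θ₀) + ∇f(θ₀)·δ tracks the true loss at θ₀ + δ when the update δ is small.

```python
import numpy as np

def loss(theta, X, y):
    """Toy least-squares loss standing in for a fine-tuning loss."""
    return 0.5 * np.mean((X @ theta - y) ** 2)

def grad(theta, X, y):
    """Gradient of the toy loss with respect to theta."""
    return X.T @ (X @ theta - y) / len(y)

def first_order_estimate(theta0, delta, X, y):
    """Estimate loss(theta0 + delta) using only values at theta0."""
    return loss(theta0, X, y) + grad(theta0, X, y) @ delta

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))       # synthetic features
y = X @ np.ones(5)                 # synthetic targets
theta0 = np.zeros(5)               # stands in for the meta initialization
delta = 0.01 * np.ones(5)          # a small hypothetical parameter update

estimate = first_order_estimate(theta0, delta, X, y)
exact = loss(theta0 + delta, X, y)
print(estimate, exact, exact - estimate)
```

For this quadratic loss the gap between `exact` and `estimate` is exactly the second-order term, 0.5 · mean((Xδ)²), so it shrinks quadratically as δ gets smaller.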
Full Text Available
Noise Stability Optimization for Finding Flat Minima: A Hessian-based Regularization Approach

Zhang, Hongyang R; Li, Dongyue; Ju, Haotian (September 2024, Transactions on Machine Learning Research)

The training of over-parameterized neural networks has received much study in recent literature. An important consideration is the regularization of over-parameterized networks due to their highly nonconvex and nonlinear geometry. In this paper, we study noise injection algorithms, which can regularize the Hessian of the loss, leading to regions with flat loss surfaces. Specifically, by injecting isotropic Gaussian noise into the weight matrices of a neural network, we can obtain an approximately unbiased estimate of the trace of the Hessian. However, naively implementing noise injection by adding noise to the weight matrices before backpropagation yields limited empirical improvement. To address this limitation, we design a two-point estimate of the Hessian penalty, which injects noise into the weight matrices along both the positive and negative directions of the random noise. In particular, this two-point estimate eliminates the variance of the first-order term of the Taylor expansion from the estimate of the Hessian. We show a PAC-Bayes generalization bound that depends on the trace of the Hessian (and the radius of the weight space), which can be measured from data. We conduct a detailed experimental study to validate our approach and show that it can effectively regularize the Hessian and improve generalization. First, our algorithm can outperform prior approaches on sharpness-reduced training, delivering up to a 2.4% test accuracy increase for fine-tuning ResNets on six image classification datasets. Moreover, the trace of the Hessian is reduced by 15.8%, and the largest eigenvalue is reduced by 9.7% with our approach. We also find that the regularization of the Hessian can be combined with alternative regularization methods, such as weight decay and data augmentation, leading to stronger regularization. Second, our approach remains highly effective for improving generalization in pretraining multimodal CLIP models and chain-of-thought fine-tuning.
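The effect of the two-point estimate can be checked numerically on a toy problem. The sketch below is an illustration under stated assumptions, not the paper's training algorithm: it uses a quadratic function with a known diagonal Hessian `A`, a hypothetical noise scale `sigma`, and Monte Carlo sampling. For u ~ N(0, σ²I), the two-point quantity ½[f(w+u) + f(w−u)] − f(w) has expectation (σ²/2)·tr(H), while the one-point quantity f(w+u) − f(w) carries an extra first-order term ∇f(w)·u that adds variance.

```python
import numpy as np

def f(w, A):
    """Quadratic test function 0.5 * w^T A w; its Hessian is A.
    Works on a single vector or on a batch of row vectors."""
    return 0.5 * np.sum((w @ A) * w, axis=-1)

rng = np.random.default_rng(0)
d, n_samples, sigma = 10, 20000, 0.1
A = np.diag(np.arange(1.0, d + 1.0))   # known Hessian, trace = 55
w = np.ones(d)                         # point at which we probe the loss

# Isotropic Gaussian perturbations, one row per sample.
U = rng.normal(scale=sigma, size=(n_samples, d))

# One-point: perturb in one direction only (first-order term remains).
one_point = f(w + U, A) - f(w, A)
# Two-point: average the +u and -u perturbations (first-order term cancels).
two_point = 0.5 * (f(w + U, A) + f(w - U, A)) - f(w, A)

# E[two_point] = (sigma^2 / 2) * trace(A), so rescale to estimate the trace.
trace_estimate = 2.0 * np.mean(two_point) / sigma**2

print("true trace:", np.trace(A), "estimate:", trace_estimate)
print("variance one-point:", one_point.var(), "two-point:", two_point.var())
```

On a quadratic the cancellation is exact; for a neural network loss, higher-order terms remain, so the same identity holds only approximately.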
Full Text Available
Scalable Multitask Learning Using Gradient-based Estimation of Task Affinity

Li, Dongyue; Sharma, Aneesh; Zhang, Hongyang R (September 2024, Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD '24))

Full Text Available
Prediction of Halo Coronal Mass Ejections Using SDO/HMI Vector Magnetic Data Products and a Transformer Model

https://doi.org/10.3847/1538-4357/adafa0

Zhang, Hongyang; Jing, Ju; Wang, Jason T L; Wang, Haimin; Abduallah, Yasser; Xu, Yan; Alobaid, Khalid A; Farooki, Hameedullah; Yurchyshyn, Vasyl (February 2025, The Astrophysical Journal)

We present a transformer model, named DeepHalo, to predict the occurrence of halo coronal mass ejections (CMEs). Our model takes as input an active region (AR) and a profile, where the profile contains a time series of data samples in the AR that are collected 24 hr before the beginning of a day, and predicts whether the AR would produce a halo CME during that day. Each data sample contains physical parameters, or features, derived from photospheric vector magnetic field data taken by the Helioseismic and Magnetic Imager on board the Solar Dynamics Observatory. We survey and match CME events in the Space Weather Database Of Notification, Knowledge, Information and the Large Angle and Spectrometric Coronagraph CME Catalog, and we compile a list of CMEs, including halo CMEs and nonhalo CMEs, associated with ARs in the period between 2010 November and 2023 August. We use the information gathered above to build the labels (positive vs. negative) of the data samples and profiles at hand, where the labels are needed for machine learning. Experimental results show that DeepHalo with a true skill statistic (TSS) score of 0.907 outperforms a closely related long short-term memory network with a TSS score of 0.821. To our knowledge, this is the first time that the transformer model has been used for halo CME prediction.
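For readers unfamiliar with the metric, the true skill statistic is the true positive rate minus the false positive rate, computed from a binary confusion matrix. The sketch below uses the standard definition; the function name and the counts in the example are hypothetical and are not taken from the paper.

```python
def true_skill_statistic(tp, fn, fp, tn):
    """TSS = TP / (TP + FN) - FP / (FP + TN).

    Ranges from -1 to 1; 1 is a perfect forecast and 0 means no skill.
    """
    true_positive_rate = tp / (tp + fn)
    false_positive_rate = fp / (fp + tn)
    return true_positive_rate - false_positive_rate

# Hypothetical confusion-matrix counts, for illustration only.
tss = true_skill_statistic(tp=90, fn=10, fp=5, tn=95)
print(round(tss, 3))
```

Because both terms are rates within their own class, TSS is not inflated by class imbalance, which is relevant when positive events such as halo CMEs are rare.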
Full Text Available
Learning Tree-Structured Composition of Data Augmentation

Li, Dongyue; Chen, Kailai; Radivojac, Predrag; Zhang, Hongyang R (September 2024, Transactions on Machine Learning Research)

Data augmentation is widely used for training a neural network given little labeled data. A common practice of augmentation training is applying a composition of multiple transformations sequentially to the data. Existing augmentation methods such as RandAugment randomly sample from a list of pre-selected transformations, while methods such as AutoAugment apply advanced search to optimize over an augmentation set of size $k^d$, which is the number of transformation sequences of length $d$, given a list of $k$ transformations. In this paper, we design efficient algorithms whose running time is provably much lower than the worst-case complexity of $O(k^d)$. We propose a new algorithm to search for a binary tree-structured composition of $k$ transformations, where each tree node corresponds to one transformation. The binary tree generalizes sequential augmentations, such as the SimCLR augmentation scheme for contrastive learning. Using a top-down, recursive search procedure, our algorithm achieves a runtime complexity of $O(2^d k)$, which is much faster than $O(k^d)$ as $k$ increases above $2$. We apply our algorithm to tackle data distributions with heterogeneous subpopulations by searching for one tree in each subpopulation and then learning a weighted combination, resulting in a forest of trees. We validate our proposed algorithms on numerous graph and image datasets, including a multi-label graph classification dataset we collected. The dataset exhibits significant variations in the sizes of graphs and their average degrees, making it ideal for studying data augmentation. We show that our approach can reduce the computation cost by 43% over existing search methods while improving performance by 4.3%. The tree structures can be used to interpret the relative importance of each transformation, such as identifying the important transformations on small vs. large graphs.
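The gap between the two complexity terms is easy to see by evaluating them directly. The sketch below only compares the growth terms $k^d$ and $2^d k$ quoted in the abstract; it ignores constant factors, does not implement the search itself, and the function names are hypothetical.

```python
def sequence_search_size(k, d):
    """Number of length-d sequences drawn from k transformations: k^d."""
    return k ** d

def tree_search_cost(k, d):
    """Growth term quoted for the top-down tree search: 2^d * k."""
    return (2 ** d) * k

depth = 3
for k in (2, 5, 10, 20):
    print(k, sequence_search_size(k, depth), tree_search_cost(k, depth))
```

At $k = 2$ the tree term is not smaller, but it grows only linearly in $k$ for fixed depth, whereas the sequence count grows as a degree-$d$ polynomial.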
Full Text Available
A Law of Robustness beyond Isoperimetry

Wu, Yihan; Huang, Heng; Zhang, Hongyang (September 2023, Fortieth International Conference on Machine Learning (ICML 2023))

Full Text Available
Nash Equilibria and Pitfalls of Adversarial Training in Adversarial Robustness Games

Balcan, Maria-Florina; Pukdee, Rattana; Ravikumar, Pradeep; Zhang, Hongyang (April 2023, Proceedings of The 26th International Conference on Artificial Intelligence and Statistics)

Full Text Available
An Analysis of Robustness of Non-Lipschitz Networks

Balcan, Maria-Florina; Blum, Avrim; Sharma, Dravyansh; Zhang, Hongyang (March 2023, Journal of Machine Learning Research)

Full Text Available
Causal Balancing for Domain Generalization

Wang, Xinyi; Saxon, Michael; Li, Jiachen; Zhang, Hongyang; Zhang, Kun; Wang, William Yang (May 2023, Eleventh International Conference on Learning Representations (ICLR 2023))

Full Text Available
